ABSTRACT

Tuberculosis (TB), also referred to as Phthisis Pulmonalis, is a contagious disease caused by the bacillus Mycobacterium
tuberculosis that affects the lungs. When left untreated, the infection spreads through the bloodstream and affects the bones,
liver and kidneys. The bacteria that cause Tuberculosis are transmitted when people with Active Tuberculosis infection sneeze,
cough or otherwise release respiratory fluids into the air. There are several tests to diagnose the Tuberculosis bacterial
infection, but the standard tests are slow, less accurate and more expensive. This paper details an automated approach developed
by the author for Tuberculosis diagnosis, which is more accurate and less expensive. The automated approach makes use of Chest
radiography to diagnose the disease. The lung region is extracted using the Graph cut lung segmentation method, identifying the
ribs and clavicles, which are needed for the diagnosis. The Graph cut lung segmentation method provides better accuracy, and
Classification is then performed between normal and abnormal X-ray patterns. Finally, the performance of the method is analyzed.


Index Terms:- Computer Aided Diagnosis, Purified Protein Derivative, Mantoux test, Lung segmentation, Image processing, 
Postero-anterior radiograph. 


1. INTRODUCTION

People who are infected with Tuberculosis do not have any symptoms until the immune system
weakens. When the immune system weakens, the Tuberculosis bacteria cause death of tissue in
the organs, and this stage is referred to as Active Tuberculosis. The symptoms of Active
Tuberculosis infection include chronic cough with blood-tinged sputum and weight loss. If
the Active Tuberculosis infection is not diagnosed periodically, it can be fatal. In 2014,
according to the World Health Organisation survey on Tuberculosis, 9.6 million people were
affected by Tuberculosis infection and 1.5 million people died due to the disease [10].
People who are infected with HIV are more often affected by Active Tuberculosis infection.
People who smoke tobacco are at higher risk of Tuberculosis infection. More than 20% of
Tuberculosis cases worldwide are attributable to smoking [1]. Mortality rates are high when
patients with Active Tuberculosis are not periodically diagnosed. Thus it is necessary to
provide periodical diagnosis for patients infected with Tuberculosis, and antibiotic
treatment is provided to increase the chance of survival. Several tests exist for diagnosing
the Tuberculosis infection. The standard test uses a clinical sputum sample, and it can take
several months to detect the growth of Mycobacterium tuberculosis. Sputum smear microscopy
is a technique to diagnose Pulmonary Tuberculosis infection, in which sputum samples are
observed under a microscope. It is a simple and inexpensive technique for identifying
infectious patients. Sputum smear microscopy has certain limitations in its performance.
The Mantoux test is used to diagnose Latent Tuberculosis, and it involves injecting Purified
Protein Derivative (PPD) Tuberculin into the skin. It is also called the Tuberculin Skin
Test. The result of the test depends on the size of the swelling caused in the skin. But the
Tuberculin Skin Test is slow and is not a reliable test for diagnosing Tuberculosis
infection.

Serological tests are carried out on a sample of blood, and they claim to diagnose
Tuberculosis by detecting antibodies in the blood. The World Health Organisation has warned
against using Serological tests to diagnose Active Tuberculosis, so some countries have
banned the use of Serological tests for Tuberculosis diagnosis. The standard tests are slow,
less accurate and more expensive. Thus a Computer aided approach is developed to diagnose
the Tuberculosis infection. In this paper, an automated approach is developed to diagnose
Tuberculosis manifestations in Chest radiographs. Chest radiography is a ubiquitous
radiological investigation that covers the thoracic anatomy, and its low cost makes it an
important step in Tuberculosis diagnosis. This paper is organized as follows. Section II
briefly summarizes the related work. Section III provides the description of the System
overview. Section IV describes the proposed framework in detail. Section V presents the
Experimental results and analysis. Conclusions and future work are given in Section VI.


2. RELATED WORK 

Chest radiography plays an important role in the detection and diagnosis of diseases
related to the lungs. The Chest radiograph shows the thoracic anatomy and provides a high
yield at low cost [1]. There are some challenges in processing Chest X-ray images. For
example, in lung Segmentation, the strong edges at the rib cage and clavicle region cause
local minima for most minimization approaches. Segmenting the lung apex is also a
nontrivial problem because of the changing intensity at the clavicle bone [18]. Examples of
a normal Chest X-ray and of abnormal Chest X-rays, i.e. without and with Tuberculosis
infection, are shown in Figure 1 and Figure 2 respectively.


Figure 1 Normal Chest X-ray 



Figure 2 Abnormal Chest X-ray: (A) Chest X-ray A represents Signs of Tuberculosis; (B) Chest X-ray B represents Cavitary
Tuberculosis; and (C) Chest X-ray C represents Pulmonary Tuberculosis


Computer Aided Diagnosis (CAD) has been popular for Tuberculosis detection and lung disease
classification. Detecting the lung regions in Chest X-ray images is an important step in
Computer Aided Diagnosis applications such as Tuberculosis or Pneumoconiosis screening. A
CAD scheme for detecting lung nodules in Chest radiographs consists of three major steps:
1) Segmentation of the lung field; 2) Feature analysis; and 3) Classification of the nodule
candidates into nodules or non-nodules by use of a nonlinear Support Vector Machine (SVM)
classifier. Automatic Segmentation of anatomical fields is the first step in Computer Aided
Systems. Some diagnostic information can be extracted from the anatomical boundaries, such
as Total Lung Capacity, which aids in the detection of Pneumonia, Pulmonary Atelectasis or
Obstructive Airways diseases. Tuberculosis classification needs the anatomical boundaries
for the further stages of classification [9]. There are several anatomical challenges in
Segmenting the lung region. One is segmenting the lung apex, because of the varying
intensities in the upper clavicle bone region. Other challenges include segmenting the
costophrenic angle and X-ray image inhomogeneities [2]. Segmentation methods for the chest
radiograph are classified as (i) rule based methods, (ii) pixel classification-based
methods, (iii) deformable model based methods and (iv) hybrid methods.

In rule based methods, segmentation is based on certain rules such as Threshold operations.
These methods rely mostly on heuristic assumptions and compute approximate solutions that
can be far from the global optimum. Therefore, rule based methods are generally used as an
initialization stage for more robust segmentation algorithms. Pixel classification-based
methods are more general than rule-based segmentation methods [3]. Segmentation in pixel
classification-based methods is based on a feature vector computed for each pixel in the
input image and on the intensities of the lung region. Deformable models are used in medical
image segmentation due to their shape flexibility. Active Shape Models (ASM) and Active
Appearance Models (AAM) have been applied to segment the lung region. Although the Active
Shape Model and Active Appearance Model approaches have become popular for biomedical
applications, they have several limitations and shortcomings, including: (i) they can become
trapped at local minima in Chest X-rays due to high contrast and strong rib cage edges,
(ii) segmentation performance relies on how closely the initial model approximates the
actual boundary, and (iii) they have many internal parameters, which produce highly variable
solutions. Hybrid methods aim to produce better results by fusing several techniques. Fusing
deformation-based (Active Shape Model, Active Appearance Model) and pixel classification
methods provides the best performance compared to other approaches [3]. Thus, a hybrid
approach is applied to detect, register and robustly segment lung organ boundaries across a
large patient population [3]. In [3], the lung region is extracted using a combination of an
intensity mask and a lung model mask derived from a training data set.

Automated nodule detection requires accurate image segmentation 1) to measure the metabolic
activity of lesions after precisely detecting them; 2) to track disease progression over
time using the Metabolic Tumour Volume (MTV) and the amount of lesion information (i.e.
total lesion activity); and 3) to determine the spatial extent of lesions pertaining to
Pulmonary infections [6]. The Graph cut algorithm models the segmentation process using an
objective function in terms of lung model properties and segments the ribs, heart, and
clavicles in the Chest radiograph. The Graph cut algorithm computes a global binary
segmentation by minimizing the objective function. In [18], Candemir S et al. developed a
Thresholding Method that extracts an object from its background by determining whether each
pixel value is greater than or equal to an intensity value T (the threshold) for the X-ray
image. The pixels are classified as object pixels or background pixels. In general,
thresholding can be categorized into three categories based on pixel values: global
thresholding, local thresholding and dynamic/adaptive thresholding. Chest X-ray image
classification is performed on the information extracted from the training set.
Classification algorithms are classified as supervised or unsupervised based on the sample
classes. In [12], Shafeena Basheer et al. used Principal Component Analysis (PCA) as the
classification method for classifying the X-ray images, and it is used for image
recognition. The features are extracted based on the Scale Invariant Feature Transform
(SIFT), an algorithm to detect and describe local features of the image. Principal
Component Analysis identifies the features and the components that represent the full object
state, which are termed Principal Components. So, the Principal Components extracted by
Principal Component Analysis implicitly represent all the features. PCA is a mathematical
procedure that applies an orthogonal transformation to convert a set of observations of
possibly correlated variables into a set of values of linearly uncorrelated variables.
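The orthogonal transformation described above can be illustrated with a minimal sketch. It is not the implementation of [12]: the function name `pca`, the use of a singular value decomposition, and the synthetic two-feature data are all assumptions made for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components.

    X: (n_samples, n_features) array of feature vectors.
    Returns (scores, components, explained_variance).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # centre each feature
    # SVD of the centred data: the rows of Vt are orthonormal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T                   # coordinates in the new basis
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance

# Synthetic data (illustrative only): two strongly correlated features.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2.0 * t + 0.1 * rng.normal(size=200)])

scores, components, var = pca(X, n_components=2)
cov_scores = np.cov(scores, rowvar=False)        # off-diagonal terms are close to zero
```

The covariance of the projected data is diagonal up to rounding error, which is what "linearly uncorrelated variables" means in practice.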

In [43], Kim Le et al. proposed a Watershed segmentation method for classification, which
requires different types of morphological operations and a watershed transformation in the
segmentation of the lung regions. Watershed segmentation is used to partition the lung
regions, and the algorithm is used to adjust the smoothness of regions and boundaries. In
[32], Osman M.K et al. used neural networks for the classification of Pulmonary
Tuberculosis. Neural networks are effective in supporting rules for Tuberculosis diagnosis
that are developed from histo-pathological variables to detect the disease. Two clustering
models are used to classify the patients. The clustering uses a Fuzzy ART Neural network,
which integrates fuzzy logic operators with the basic characteristics of Adaptive Resonance
Theory (ART). In [28], Quellec G et al. developed a Decision tree for the classification of
disease. The decision tree consists of nodes and branches. Each node specifies a decision,
and the start node is called the root node. The branches provide the result based on the
conditions provided. The decision tree learning algorithm has high transparency and
accuracy. In [1], Jaeger S et al. used a Support Vector Machine as the classification
algorithm, which finds the best hyperplane in the input space. The purpose of the Support
Vector Machine is to find the optimized separator function, called the classifier. The
Support Vector Machine (SVM) is used to diagnose the presence of Tuberculosis based on Chest
radiographs.


3. SYSTEM OVERVIEW 

This section presents the system overview of Tuberculosis 
diagnosis in detail. The developed automated approach for the 
diagnosis of Tuberculosis using Chest radiograph is shown in 
Figure 3. 
[Flow diagram; recoverable block labels: Noise Removal: Median Filter; Feature Computation]


Figure 3 System Overview: Tuberculosis Diagnosis


The steps include Chest radiograph Pre-processing, Lung segmentation, Feature computation,
and Classification of the input X-ray as a normal or abnormal Chest radiograph, i.e.
without or with Tuberculosis [1].


4. THE PROPOSED FRAMEWORK 

This section presents the implemented methods for Lung 
Segmentation, Feature computation, and Classification. The 
system takes Chest X-ray as the input and segments the lung 
region of the input Chest X-ray using Graph cut lung 
segmentation method in combination with a lung model. For 
the segmented lung field, the system computes a set of features 
as input to a pre-trained classifier. Finally, the classifier outputs 
its confidence in classifying the input Chest X-Ray as a 
Tuberculosis positive case or Tuberculosis negative case based 
on the computed features. 


A. Image Pre-Processing 

Image Pre-processing resamples the Chest radiographs, increases the Grey scale contrast and
improves Chest X-ray image quality. The pre-processing step involves the addition of noise
and the removal of noise, in order to suppress unwanted distortions and enhance the features
of the image for further processing. Image pre-processing is needed for:

• Noise reduction

• Contrast enhancement

• Elimination of acquisition-specific artifacts


B. Graph Cut Lung Segmentation Algorithm 

The Graph cut approach models lung boundary detection with an objective function. To
formulate the objective function, three requirements on the lung region are considered:
a) the lung region should be consistent with typical Chest X-ray intensities, b) neighboring
pixels should have consistent labels, and c) the lung region needs to be similar to the lung
model. An example of lung segmentation in a Chest X-ray is shown in Figure 4.


Figure 4 Lung Segmentation 
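A toy sketch of a binary graph cut is given here to make the three requirements concrete. The paper does not state the exact energy terms, so the unary costs, the constant `smoothness` weight, the hypothetical `model` prior map and the choice of the Edmonds-Karp max-flow routine are all assumptions of this example, which is only practical for very small images.

```python
from collections import deque
import numpy as np

def min_cut_source_side(capacity, source, sink):
    """Edmonds-Karp max-flow; returns the nodes on the source side of a minimum cut."""
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0.0)   # reverse edges
    while True:
        # breadth-first search for a shortest augmenting path
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return set(parent)          # everything still reachable from the source
        # find the bottleneck on the path, then push that much flow along it
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u

def graph_cut_segment(image, model, smoothness=0.5):
    """Label each pixel True (lung) or False, for a small image scaled to [0, 1].

    Assumptions of this sketch: lungs are darker than the surrounding tissue, and
    `model` holds a per-pixel prior probability of lung (a stand-in lung model).
    """
    h, w = image.shape
    cost_lung = image + (1.0 - model)            # (a) intensity term + (c) model term
    cost_background = (1.0 - image) + model
    capacity = {"s": {}, "t": {}}
    for y in range(h):
        for x in range(w):
            p = (y, x)
            capacity.setdefault(p, {})
            capacity["s"][p] = float(cost_background[y, x])  # paid if p becomes background
            capacity[p]["t"] = float(cost_lung[y, x])        # paid if p becomes lung
            for q in ((y + 1, x), (y, x + 1)):               # (b) 4-connected neighbours
                if q[0] < h and q[1] < w:
                    capacity.setdefault(q, {})
                    capacity[p][q] = smoothness
                    capacity[q][p] = smoothness
    labels = np.zeros((h, w), dtype=bool)
    for node in min_cut_source_side(capacity, "s", "t"):
        if node != "s":
            labels[node] = True
    return labels

# 5 x 5 toy image: dark "lung" block with one bright pixel in its centre.
image = np.full((5, 5), 0.9)
image[1:4, 1:4] = [[0.2, 0.1, 0.2], [0.1, 0.9, 0.1], [0.2, 0.1, 0.2]]
model = np.full((5, 5), 0.2)
model[1:4, 1:4] = 0.8
labels = graph_cut_segment(image, model)
```

In this toy case the bright centre pixel would be labelled background by its intensity alone, but the neighbour-consistency term keeps it inside the lung region.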


C. Feature Extraction 

The normal and abnormal Chest X-ray patterns are identified 
based on 1) Object detection Inspired features and 2) Content 
based image retrieval features. 


1) Object Detection Inspired Features 

In object detection inspired features, features are described based on the appearance
pattern, using a combination of shape, edge, gradient, and texture descriptors. A histogram
is computed for each descriptor from its values across the lung field. The descriptor
histograms together form a feature vector, which is used as the input for the classifier.
The following shape and texture descriptors are used.


• Intensity Histograms (IH)
• Gradient Magnitude Histograms (GM)
• Shape Descriptor Histograms (SD)

$$SD = \tan^{-1}\left(\frac{\lambda_1}{\lambda_2}\right) \quad (1)$$

where $\lambda_1$ and $\lambda_2$ are the Eigen values of the Hessian Matrix, with $\lambda_1 \le \lambda_2$.

• Curvature Descriptor Histograms (CD)

$$CD = \tan^{-1}\left(\frac{\sqrt{\lambda_1^2 + \lambda_2^2}}{1 + I(x, y)}\right) \quad (2)$$

with $0 \le CD \le \pi/2$, where $I(x, y)$ denotes the pixel intensity for the pixel $(x, y)$.

• Histogram of Oriented Gradients (HOG): a descriptor for gradient orientations weighted by
gradient magnitude.
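The following sketch shows one way the IH, GM, SD and CD histograms could be computed and concatenated into a feature vector. The finite-difference Hessian, the 32 bins, the assumption that the image is scaled to [0, 1], and the convention of setting the eigenvalue ratio to 0 where the larger eigenvalue is 0 are choices made for this illustration, not details given in the paper. HOG is omitted for brevity.

```python
import numpy as np

def hessian_eigenvalues(image):
    """Per-pixel eigenvalues (lam1 <= lam2) of the 2x2 Hessian, by finite differences."""
    gy, gx = np.gradient(image)
    gyy, _ = np.gradient(gy)
    gxy, gxx = np.gradient(gx)
    # closed-form eigenvalues of the symmetric matrix [[gxx, gxy], [gxy, gyy]]
    half_trace = 0.5 * (gxx + gyy)
    root = np.sqrt((0.5 * (gxx - gyy)) ** 2 + gxy ** 2)
    return half_trace - root, half_trace + root

def descriptor_histograms(image, mask, bins=32):
    """Concatenate normalised IH, GM, SD and CD histograms computed inside `mask`."""
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)
    grad_mag = np.hypot(gx, gy)
    lam1, lam2 = hessian_eigenvalues(image)
    # SD = arctan(lam1 / lam2); the ratio is set to 0 where lam2 == 0 (assumption)
    ratio = np.divide(lam1, lam2, out=np.zeros_like(lam1), where=lam2 != 0)
    sd = np.arctan(ratio)
    # CD = arctan(sqrt(lam1^2 + lam2^2) / (1 + I(x, y)))
    cd = np.arctan(np.hypot(lam1, lam2) / (1.0 + image))
    descriptors = [
        (image, (0.0, 1.0)),
        (grad_mag, (0.0, grad_mag.max() + 1e-12)),
        (sd, (-np.pi / 2, np.pi / 2)),
        (cd, (0.0, np.pi / 2)),
    ]
    parts = []
    for values, value_range in descriptors:
        hist, _ = np.histogram(values[mask], bins=bins, range=value_range)
        parts.append(hist / max(hist.sum(), 1))   # normalise away the lung size
    return np.concatenate(parts)

# Random image and square mask as stand-ins for a radiograph and a lung field.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
mask = np.zeros((64, 64), dtype=bool)
mask[16:48, 16:48] = True
features = descriptor_histograms(image, mask)     # length 4 * 32 = 128
```

Normalising each histogram by its own sum makes the feature vector independent of the number of pixels in the segmented lung field.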


2) Content Based Image Retrieval Features 

In Content based image retrieval features, low level features are identified. The low level
features include the edges, shape moments and intensity of the Chest radiograph. Content
based image retrieval features include the following descriptors:

• Tamura texture descriptor:

In the Tamura texture descriptor, coarseness, contrast and directionality features are
identified.


• CEDD and FCTH:

CEDD (Color and Edge Direction Descriptor) and FCTH (Fuzzy Color and Texture Histogram)
incorporate color and texture information in the histogram.


• Hu moments:

Hu moments are invariant under Chest X-ray scaling, rotation and translation.


• Primitive Length, Edge Frequency and Auto-correlation:

These are texture analysis methods that use statistical rules to describe the spatial
distribution of grey values.
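The invariance of Hu moments can be checked numerically. The sketch below computes only the first two of the seven Hu invariants from normalised central moments; the function name `hu_first_two` and the synthetic L-shaped test image are assumptions made for illustration. On a pixel grid, translation and 90 degree rotation leave the values unchanged, while scaling does so only approximately.

```python
import numpy as np

def hu_first_two(image):
    """First two Hu moment invariants of a 2-D intensity image."""
    image = np.asarray(image, dtype=float)
    y, x = np.mgrid[:image.shape[0], :image.shape[1]]
    m00 = image.sum()
    cx, cy = (x * image).sum() / m00, (y * image).sum() / m00

    def eta(p, q):
        # normalised central moment of order (p, q)
        mu = ((x - cx) ** p * (y - cy) ** q * image).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return np.array([phi1, phi2])

# Synthetic L-shaped blob standing in for a segmented structure.
blob = np.zeros((40, 40))
blob[5:15, 5:25] = 1.0
blob[15:25, 5:10] = 1.0

original = hu_first_two(blob)
translated = hu_first_two(np.roll(blob, (7, 9), axis=(0, 1)))   # shift without wrap-around
rotated = hu_first_two(np.rot90(blob))                          # rotate by 90 degrees
scaled = hu_first_two(np.kron(blob, np.ones((2, 2))))           # upscale by a factor of 2
```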


D. Classification 

Support Vector Machine is a supervised learning model used as the Classification algorithm
for Tuberculosis diagnosis. It classifies the computed feature vectors as either normal or
abnormal. It is a non-probabilistic classifier that generates hyperplanes to separate
samples from two different classes in a space with possibly infinite dimension.
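To illustrate the separating hyperplane, the sketch below trains a linear soft-margin SVM by subgradient descent on the hinge loss. This is a simplification: the paper does not state the kernel or solver used, and a kernel SVM would be needed for the infinite-dimensional case mentioned above. The function names, the hyperparameters and the two-dimensional synthetic feature vectors are assumptions of this example.

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Minimise 0.5 * ||w||^2 + C * sum(hinge loss). Labels in y must be -1 or +1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        violators = margins < 1          # inside the margin or misclassified
        grad_w = w - C * (X[violators].T @ y[violators])
        grad_b = -C * y[violators].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def decision_function(X, w, b):
    """Signed score; its sign gives the predicted class."""
    return X @ w + b

# Hypothetical feature vectors: -1 = normal, +1 = abnormal.
rng = np.random.default_rng(0)
normal = rng.normal(loc=-2.0, scale=0.5, size=(50, 2))
abnormal = rng.normal(loc=2.0, scale=0.5, size=(50, 2))
X = np.vstack([normal, abnormal])
y = np.concatenate([-np.ones(50), np.ones(50)])

w, b = train_linear_svm(X, y)
scores = decision_function(X, w, b)
accuracy = np.mean(np.sign(scores) == y)
```

The magnitude of the score can serve as a rough confidence value, in the sense that samples far from the hyperplane are classified with more certainty than samples close to it.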


5. EXPERIMENTAL RESULTS AND ANALYSIS 

This section presents the practical evaluation of the work. Graph cut lung segmentation and
feature extraction are performed. The Chest X-ray image is taken as the input image and the
diagnosis of Tuberculosis is performed. Chest X-ray image pre-processing is the initial step
in the process. Pre-processing of the Chest radiograph includes the addition of noise and
the removal of the noise using a Median filter. The noise is added as Salt and Pepper noise
with a given noise density. The addition of Salt and Pepper noise is shown in Figure 5.


Figure 5 Addition of Salt and Pepper noise 


The Median Filter is a filtering operation that removes the noise in the Chest radiograph
and thereby improves the quality of the radiograph for the later steps. The result of
applying the Median Filter to the Chest X-ray is shown in Figure 6.


Figure 6 Median Filter 
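The noise addition and removal steps can be sketched as follows. The noise density of 0.05, the 3 x 3 window, the edge-replicating border handling and the synthetic gradient image are assumptions of this example; the paper does not report the values it used.

```python
import numpy as np

def add_salt_and_pepper(image, density, rng):
    """Set a fraction `density` of pixels to 0 (pepper) or 1 (salt), about half each."""
    noisy = image.copy()
    r = rng.random(image.shape)
    noisy[r < density / 2] = 0.0
    noisy[(r >= density / 2) & (r < density)] = 1.0
    return noisy

def median_filter(image, size=3):
    """Median over a size x size window, replicating edge pixels at the border."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.2, 0.8, 64), (64, 1))   # synthetic stand-in for a radiograph
noisy = add_salt_and_pepper(clean, density=0.05, rng=rng)
restored = median_filter(noisy, size=3)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_restored = np.mean((restored - clean) ** 2)       # much smaller than mse_noisy
```

The median is well suited to this noise type because isolated extreme values rarely reach the middle of the sorted window, so they are discarded rather than averaged in.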


The Chest radiograph is segmented using the Graph cut lung segmentation algorithm. The
algorithm segments the lung region and identifies the ribs and clavicles, which are needed
as input for feature extraction. The Segmented Lung region is shown in Figure 7.


Figure 7 Segmented Lung Regions 


Feature extraction is performed to extract the features of the Chest radiograph. The
features are extracted based on the Intensity Histogram, Gradient Magnitude Histogram and
Shape Descriptor Histogram. Figure 8 represents the Feature Extraction performed on the lung
region.


Figure 8 Feature Extraction 


Classification is performed using the Support Vector Machine to diagnose the presence or
absence of Tuberculosis in the Chest Radiograph. Classification using the Support Vector
Machine is shown in Figure 9.

